Chapter 20

Getting the Hint from Epidemiologic Inference

IN THIS CHAPTER

Choosing potential confounders for your regression model

Using a modeling approach to develop a final model

Adding interactions to the final model

Interpreting the final model for causal inference

In Parts 5 and 6, we describe different types of regression, such as ordinary least-squares regression, logistic regression, Poisson regression, and survival regression. In each kind of regression we cover, we describe a situation in which you are performing multivariable regression (sometimes loosely called multivariate regression), which means you are making a regression model with more than one independent variable. Those chapters describe the mechanics of fitting these multivariable models, but they don’t provide much guidance on which independent variables to consider putting in the multivariable model.
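To make the idea of "more than one independent variable" concrete, here is a minimal sketch in Python using NumPy. The variable names (exposure, age, outcome), the helper fit_ols, and the data are all hypothetical and invented for illustration. The outcome is built without noise so that the fitted coefficients can be checked by eye. It shows only the mechanics of a two-covariate least-squares fit, not a recommended analysis.

```python
import numpy as np

def fit_ols(y, *covariates):
    """Fit y = b0 + b1*x1 + b2*x2 + ... by ordinary least squares.

    Returns the coefficients as [intercept, b1, b2, ...].
    """
    # Design matrix: a column of ones for the intercept, then one column per covariate.
    X = np.column_stack([np.ones(len(y)), *covariates])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical data (an assumption for illustration, not from any study).
exposure = np.array([0, 0, 0, 1, 1, 1, 0, 1], dtype=float)      # 1 = exposed
age = np.array([30, 40, 50, 35, 45, 55, 60, 65], dtype=float)   # years

# Noise-free outcome, so the fit should recover 100, 5, and 0.5.
outcome = 100 + 5 * exposure + 0.5 * age

# Multivariable model: two independent variables in one regression.
coef = fit_ols(outcome, exposure, age)
print(coef)  # intercept, exposure coefficient, age coefficient
```

With real data the outcome would include random variation, so the estimates would come with standard errors and would not match any "true" values exactly.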

The chapters in Parts 5 and 6 also discuss model-fitting, which means the act of trying to refine your regression model so that it optimally fits your data. When you have a lot of candidate independent variables (or candidate covariates), part of model-fitting has to do with deciding which of these variables actually fit in the model and should stay in, and which ones don’t fit and should be kicked out. Part of what guides this decision-making process is the mechanics of modeling and model-fitting. The other main part of what guides these decisions is the hypothesis you are trying to test with your model, which is the focus of this chapter.

In this chapter, we revisit the concept of confounding from Chapter 7 and explain how to choose candidate covariates for your regression model. We also discuss modeling approaches and explain how to add interaction terms to your final model.

Staying Clearheaded about Confounding

Chapter 7 discusses study design and terminology in epidemiology. As a reminder, in epidemiology, exposure refers to a factor you hypothesize to cause a disease (or outcome). In your regression model, the outcome is the dependent variable. The exposure will be one of the covariates in your model. But what other covariates belong in the model? How do you decide on a collection of candidate independent variables that you would even consider putting in a model with the exposure? The answer is that you choose them on the basis of their status as potential confounders.

A confounder is a factor that meets these three criteria: